Abstract



We consider the problem of defining conditional objects (a|b), which
would allow one to regard the conditional probability Pr(a|b) as the
probability of a well-defined event rather than as shorthand for
Pr(ab)/Pr(b). The next issue is to define boolean combinations of
conditional objects, and possibly also the operator of further
conditioning. These questions have been investigated at least since the
time of George Boole, leading to a number of formalisms proposed for
conditional objects, mostly of a syntactical, proof-theoretic vein.

We propose a unifying, semantical approach, in which conditional
events are (projections of) Markov chains, definable in the three-valued
extension (TL|TL) of the past tense fragment of propositional linear
time logic (TL), or, equivalently, by three-valued counter-free Moore
machines. Thus our conditional objects are indeed stochastic processes,
one of the central notions of modern probability theory.

Our model precisely fulfills early ideas of de Finetti and, moreover,
as we show in a separate paper [30], all the previously proposed
algebras of conditional events can be isomorphically embedded in our
model.
1 Preliminaries and statement of the problem

1.1 The problem of conditional objects 



Probabilistic reasoning [26] is the basis of Bayesian methods of expert
system inference, of knowledge discovery in databases, and of several other
domains of computer, information, and decision sciences. The model of
conditioning and conditional objects we discuss serves equally to reason about
probabilities over a finite domain X, or about probabilistic propositional
logic with a finite set of atomic formulae.

Computing conditional probabilities of the form Pr(X|Y1, ..., Yn) and, by
extension, conditional beliefs, is well understood. Attempts to define first
the conditional objects of the basic form X|Y, and then to define Pr(X|Y) as
Pr((X|Y)), were made, without much success, by some of the founders
of probability [2], [6]. They were taken up systematically only about 1980.



The development was slow, both because of logical difficulties |2^, 1£, 17]
and even more because the computational model is difficult to construct.
(While a|b appears to stand for a sentence 'if b then a', there is no obvious
calculation for Pr(a|(b|c)), nor an intuitive meaning for a|(b|c),
(a|b) ∧ (c|d), and the like.)

The idea of defining conditional objects was entertained by some founders
of modern probability |§, ^, but generally abandoned after the introduction
of the measure-theoretic model. It was revived mostly by philosophers in
the 1970s Ijl], ^ with a view towards artificial intelligence reasoning.
Formal computational models came in the late 1980s and early 1990s IS, 11].
Only a few of them have been used for actual calculations of conditionals
and their probabilities, and the values obtained are open to question
]§, |ll|.
In this paper we want to give a rigorous (and yet quite natural and intuitive)
probabilistic and semantical construction of conditionals, based on ideas
proposed by de Finetti over a quarter of a century ago ]p. It appears that
this single formalism contains fragments precisely corresponding to all the
previously considered algebras of conditional events |^]. Seen as a whole,
it can therefore be considered as their common generalisation and perhaps
the calculus of conditionals.

Our system consists of three layers: the logical part is a three-valued
extension of the past tense fragment of propositional linear time logic, the
computational model consists of three-valued Moore machines (an extension of
deterministic finite automata), and the probabilistic semantics is provided by
three-valued stochastic processes, which appear to be projections of Markov
chains.



1.2 The main idea 



The main idea. Our approach can be seen as an attempt to provide a precise
mathematical implementation of the following idea of
de Finetti [|, Sect. 5.12]:

"In the asymptotic approach, the definition of conditional
probability appears quite naturally; it suffices to repeat the definition
of probability (as the limiting frequency), taking into consideration
only the trials in which the conditioning event (hypothesis)
is satisfied. Thus, P(E|H) is simply the limit of the ratio between
the frequency of EH and the frequency of H. If the limiting
frequency of H exists and is different from zero, the definition is
mathematically equivalent to the compound probability theorem
P(E|H) = P(EH)/P(H). But even if the frequency of H does
not tend to a limit, or the limit is zero, P(E|H) can nonetheless
exist (trivial example: P(H|H) is always equal to 1)."

We believe that our attempt is successful: our system will have all the
properties predicted by de Finetti and, moreover, as we show in a separate
paper [30], subsumes all the previously existing formalisms developed to deal
with conditionals. Finally, it appears to be able to handle some well-known
paradoxes of probability in an intuitive and yet precise manner.

Three truth values. To be able to take into account only the trials in
which the hypothesis is satisfied, one has to introduce a third logical value.
Informally, if one considers two players^1, one betting that (a|b) will hold
and the other that it will not, then if in a random experiment (dice toss,
coin flip) b does not hold, the game is drawn. Previous works took this as
evidence that the definition of conditionals must necessarily be based on
many-valued logics, the typical choice being three-valued.

Note, however, that assigning a probability to a three-valued c is something
like squeezing it to become two-valued. For one then assumes it to be true
Pr(c) of the time and false 1 − Pr(c) of the time, and the time when c has
the third value, typically described as undefined, is lost. So, unlike most
of our predecessors, we attempt to preserve the three-valuedness of
conditionals as a principle, and define their probability only on top of that.

Bet repetitions. Now, we should allow the players to repeat their bets. 
Here, unlike most of the previous works, if the players repeat the game, we 
allow them to bet on properties of the whole sequence of outcomes, not just 
the last one. 

It is not uncommon in random experiments that the history of the
bets somehow influences the present bet.

We present three natural examples, each with a simple description.

^1 This sounds definitely better than gamblers ;-).
The first possibility is that after each bet we start over: after the result
of the experiment is settled, the (temporal) history is started anew, and the
next experiment does not take the old results into account.

The second is just the opposite: the entire history, including earlier
experiments, is always taken into account.

The third is that no repetition is allowed: after the first experiment is settled, 
its outcome is deemed to persist forever, and future trials are effectively null. 
(Regardless of each subsequent element drawn the result is always defined 
and remains the same.) 

Roughly speaking, the first choice is adopted in bridge, the second in black- 
jack and the third in Russian roulette. 

This suggests that a conditional is not merely an experiment with three
possible outcomes. It is indeed a sequence of experiments, and the third
logical value, often described as unknown, is often better read as not yet
known. It is clearly a temporal concept, and thus we are going to consider
conditionals as temporal objects. This temporal aspect is clearly of past
tense type: the result of a bet must depend only on the history (including
the present) of the sequence of outcomes.

It is worth noting that there are other approaches which implicitly consider
bet repetition in the modelling of conditionals. These include [24], [11], [27].



Summary. What we undertake is thus the development of a calculus of
conditional objects identified with temporal rules, which, given a sequence
of random elements from the underlying domain, decide after each of the
drawn elements whether the conditional becomes defined, and if so, whether it
is true or false.

We stipulate that, for any reasonable calculus of conditionals, forming boolean
combinations of conditionals, as well as iterated conditionals, amounts to
manipulating these rules.

This claim is indeed well motivated: if we fail to associate such a rule with
a complex conditional object, we have no means of saying, in a real-life
situation, who wins the bet on this conditional and when. So to say, such
a conditional would be nonprobabilistic, because one couldn't bet on it!



Novelty of our approach. We would like to stress that virtually none of
the results we prove below is entirely new. Most of them are simple
extensions or reformulations of already known theorems, as the reader can
verify in Section TA. The novelty of our approach lies almost entirely in the
way we assemble the results to create a mathematically precise representation
of an otherwise quite clear and intuitive notion. And indeed, we feel very
reassured by the fact that we didn't have to invent any new mathematics for
our construction. Similarly, the proofs we give in this paper are quite
straightforward. It was precisely the emergence of previously unheard-of,
complicated algebraic structures (dubbed conditional event algebras in [^)
that prompted us to have a closer look at conditional events and search for
simpler and more intuitive formalisations. Note that probabilists and
logicians have been doing quite well without conditional events for decades,
which strongly suggests they have had all the tools necessary to use
conditionals in an implicit way for a long time already. By contrast, in the
emerging applied areas, and in particular in AI, there is a strong need to
have conditional events explicitly present, and this is why we believe in the
importance of our results.

2 The tools 

2.1 Pre-conditionals 

Let ℰ = {a, b, c, d, ...} be a finite set of basic events, let Σ be the free
Boolean algebra generated by ℰ, and let Ω be the set of atoms of Σ.
Consequently, Σ is isomorphic to the power set of Ω, and Ω itself is
isomorphic to the power set of ℰ. Any element of Σ will be considered an
event, and, in particular, ℰ ⊆ Σ.

The union, intersection and complementation in Σ are denoted by a ∪ b, a ∩ b
and a^c, respectively. The least and greatest elements of Σ are denoted ∅
and Ω, respectively. However, sometimes we use a more compact notation,
replacing ∩ by juxtaposition. When we turn to logic, it is customary to use
yet another notation: a ∨ b, a ∧ b and ¬a, respectively. In this situation
Ω appears as true and ∅ as false, but 1 and 0, respectively, are occasionally
used as well. Generally we are quite anarchistic in our notation, as long as
it does not create ambiguities.

We introduce the set 3 = {0, 1, ⊥} of truth values, interpreted as false, true
and undefined, respectively. The subset of 3 consisting of 0 and 1 will be
denoted 2.

It follows from the discussion above that we are going to look for
conditionals in the set 𝒫𝒞 = 3^(Ω^+) of three-valued functions c from the
set Ω^+ of finite nonempty sequences of atomic events from Ω into 3. We will
call such functions pre-conditionals, since to deserve the name of
conditionals they must obey some additional requirements.

Sometimes it is convenient to represent such objects in two other, slightly
different, yet equivalent forms:

• The second representation consists of length-preserving mappings
c+ : Ω^+ → 3^+ such that c+(v) is a prefix of c+(vw). The set of all such
mappings will be denoted 𝒫𝒞+.

• The third representation consists of mappings c∞ : Ω^∞ → 3^∞ such that if
w, v ∈ Ω^∞ have a common prefix of length n, then c∞(w) and c∞(v)
have a common prefix of length n, too. The set of all such mappings
will be denoted 𝒫𝒞∞.

On the set Ω^+ ∪ Ω^∞ one has the natural partial order relation of being a
prefix. Suprema of sets in this partial order are denoted by ⊔.
In general, c, c+ and c∞ always denote three representations of the same
pre-conditional; the subscript (or its absence) indicates which representation
we take at the moment, and we choose it according to what is most convenient.
The three representations c, c+ and c∞ are related by the equalities



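\[
\begin{aligned}
c(\omega_1 \ldots \omega_n) &= \text{last-letter-of}(c_+(\omega_1 \ldots \omega_n)),\\
c(\omega_1 \ldots \omega_n) &= \text{$n$th-letter-of}(c_\infty(\omega_1 \ldots \omega_n \ldots)),\\
c_+(\omega_1 \ldots \omega_n) &= c(\omega_1)\,c(\omega_1\omega_2) \ldots c(\omega_1 \ldots \omega_n),\\
c_+(\omega_1 \ldots \omega_n) &= \text{first-$n$-letters-of}(c_\infty(\omega_1 \ldots \omega_n \ldots)),\\
c_\infty(\omega_1 \ldots \omega_n \ldots) &= c(\omega_1)\,c(\omega_1\omega_2) \ldots c(\omega_1 \ldots \omega_n) \ldots,\\
c_\infty(\omega_1 \ldots \omega_n \ldots) &= \bigsqcup \{ c_+(\omega_1 \ldots \omega_n) \mid n = 1, 2, \ldots \}.
\end{aligned}
\tag{1}
\]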
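The passage between the three representations can be illustrated by a small Python sketch. It is only an illustration under stated assumptions: atoms of Ω are encoded as strings listing the basic events that hold ("ab", "a", "b", or "-" for none), the value ⊥ is encoded as `None`, and the names `to_plus`, `from_plus` and `first_n_of_inf` are hypothetical helpers, not notation from the paper.

```python
from itertools import islice

UNDEF = None  # encodes the third truth value ⊥ (an encoding chosen for this sketch)

def example_c(word):
    """A sample present tense pre-conditional: look only at the last atom."""
    last = word[-1]
    if "b" not in last:
        return UNDEF
    return 1 if "a" in last else 0

def to_plus(c):
    """c+ : the word of values c takes on all nonempty prefixes."""
    def c_plus(word):
        return [c(word[:k]) for k in range(1, len(word) + 1)]
    return c_plus

def from_plus(c_plus):
    """Recover c as the last letter of c+."""
    def c(word):
        return c_plus(word)[-1]
    return c

def first_n_of_inf(c, stream, n):
    """First n letters of c∞ on an infinite stream of atoms (any iterator)."""
    prefix = list(islice(stream, n))
    return to_plus(c)(prefix)
```

For instance, `to_plus(example_c)(["a", "ab", "b"])` yields the three-letter word ⊥, 1, 0, and its last letter is the value of `example_c` on the whole word.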



Even though we are on a rather preliminary level of our construction, we can 
address the general question of defining connectives among pre-conditionals 
already now. In our setting such a connective is indeed a function from 
some power of the space of pre-conditionals into itself. However, to fulfill the 
requirement that a connective should depend solely on the outcomes of its 
arguments (this property is called extensionality in the logic literature), and 
that it should refer to the history, only, the following additional condition 
must be met. 

For any connective α : 𝒫𝒞+^n → 𝒫𝒞+ and any φ1, ..., φn, φ'1, ..., φ'n ∈ 𝒫𝒞+,
v, w ∈ Ω^+ satisfying φi(w) = φ'i(v) for i = 1, ..., n, it holds that

α(φ1, ..., φn)(w) = α(φ'1, ..., φ'n)(v).

Note that we permit strong dependence on the history: we do not require the
connective to depend just on the present values of its arguments, we allow
it to depend on their whole histories. However, if a particular connective
α meets the former, stronger requirement, whose formal statement can be
obtained from the above condition by replacing 𝒫𝒞+ by 𝒫𝒞 everywhere it
occurs, we call it a present tense connective.

Connectives which are not present tense will be called past tense. Any n-ary
present tense connective of pre-conditionals is fully characterised by a
mapping 3^n → 3. Note that any connective α, not necessarily a present tense
one, can be completely specified by a mapping

\[
\bigcup_{k>0} \underbrace{3^k \times \cdots \times 3^k}_{n \text{ times}} \to 3 .
\]

Just like their connectives, pre-conditionals can be present tense, too. A
pre-conditional c : Ω^+ → 3 is called present tense iff c(v) = c(w) holds
whenever last-letter-of(v) = last-letter-of(w). So indeed a present tense
pre-conditional is completely determined by a function Ω → 3.

2.2 The formalisms 

Our intention is to distinguish conditionals among pre-conditionals. There- 
fore, in order to deal with them, we need a formalism aimed at dealing with 
sequences of symbols from a finite alphabet. There are many candidates of 
this kind, including regular expressions and their subclasses, grammars of 
various kinds, deterministic or nondeterministic automata, temporal logics, 
first order logic and higher order logics. 

Our choice, which will be carefully motivated later on, is to use three-valued
counterparts of a certain particular class of finite automata and of past tense
temporal logic. When probabilities come into play, conditional events of
a fixed probability space are represented by Markov chains.
We briefly introduce here the main formalisms used throughout this paper:
temporal logic, Moore machines and Markov chains.

2.3 Temporal logic 

Let us first define temporal logic of linear discrete past time, called TL. We 
follow the exposition in tailoring the definitions somewhat towards our 
particular needs. 

The formulas are built up from the set ℰ (the same set of basic events as
before), interpreted as propositional variables here, and are closed under the
following formula formation rules:

1. Every a ∈ ℰ is a formula of temporal logic.

2. If φ, ψ ∈ TL, then their boolean combinations φ ∨ ψ and ¬φ are in TL.
The other Boolean connectives ∧, →, ... can be defined in terms
of ¬ and ∨, as usual.

3. If φ, ψ ∈ TL, then their past tense temporal combinations ●φ and
φ Since ψ are in TL, where ●φ is spelled "previously φ."

A model of temporal logic is a sequence M = s0, s1, ... of states, each
state being a function from ℰ (the same set of basic events as before) to
the boolean values {0, 1}. Note that a state can therefore be understood as
an atomic event from Ω, and M can be thought of as a word from Ω^+. To
be explicit we declare that the states of M are ordered by ≤. Rather than
using the indices of states to denote their order, we simply write s ≤ t to
denote that a state t comes later than, or is equal to, a state s; similarly
s + 1 denotes the successor state of s. We adopt the convention that, unless
explicitly indicated otherwise, a model is always of length n + 1, and thus n
is always the last state of a model.
For every state s of M we define inductively what it means that a formula
φ ∈ TL is satisfied in the state s of M, symbolically M, s ⊨ φ.

1. M, s ⊨ a iff s(a) = 1.

2. M, s ⊨ ¬φ :⟺ M, s ⊭ φ;
   M, s ⊨ φ ∨ ψ :⟺ M, s ⊨ φ or M, s ⊨ ψ.

3. M, s ⊨ ●φ :⟺ s > 0 and M, s − 1 ⊨ φ;
   M, s ⊨ φ Since ψ :⟺ (∃t ≤ s)(M, t ⊨ ψ and (∀t < w ≤ s) M, w ⊨ φ).

The syntactic abbreviations ■φ and ♦φ are in common use in TL. They
are defined by ♦φ = true Since φ and ■φ = ¬♦¬φ. The first, ♦φ, is
spelled "once φ" and the second, ■φ, "always in the past φ".
Their semantics is then equivalent to

M, s ⊨ ■φ :⟺ (∀t ≤ s) M, t ⊨ φ;
M, s ⊨ ♦φ :⟺ (∃t ≤ s) M, t ⊨ φ.

Using the given temporal and boolean connectives, one can write down quite 
complex formulae describing temporal properties of models M, s. We will see 
several such examples in this paper, and even more can be found in pO| ]. 
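The satisfaction relation of past tense TL can be sketched as a short recursive evaluator in Python. This is a minimal illustration, not the paper's formalisation: formulas are encoded as nested tuples, a model is a list of sets of the basic events true in each state, and the constant `TRUE` as well as the helper names `once` and `always` are conveniences assumed for this sketch. "Since" is read as: the second argument held at some state t ≤ s, and the first argument has held at every state after t up to s.

```python
TRUE = ("true",)  # convenience constant, assumed for this sketch

def holds(model, s, f):
    """Decide M, s |= f for a model given as a list of sets of basic events."""
    tag = f[0]
    if tag == "true":
        return True
    if tag == "atom":
        return f[1] in model[s]
    if tag == "not":
        return not holds(model, s, f[1])
    if tag == "or":
        return holds(model, s, f[1]) or holds(model, s, f[2])
    if tag == "prev":  # "previously": there is a preceding state and f held there
        return s > 0 and holds(model, s - 1, f[1])
    if tag == "since":  # ("since", phi, psi) encodes "phi Since psi"
        phi, psi = f[1], f[2]
        return any(
            holds(model, t, psi)
            and all(holds(model, w, phi) for w in range(t + 1, s + 1))
            for t in range(s + 1)
        )
    raise ValueError(f"unknown connective: {tag!r}")

def once(f):
    """Abbreviation: 'once f' = true Since f."""
    return ("since", TRUE, f)

def always(f):
    """Abbreviation: 'always in the past f' = not once not f."""
    return ("not", once(("not", f)))
```

The evaluator is deliberately naive (it re-evaluates subformulas), which keeps it close to the inductive definition rather than efficient.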

2.4 Moore machines 

In this section we follow tailoring the definitions, again, towards our 
needs. 

A deterministic finite automaton is a five-tuple 𝔄 = (Q, Ω, δ, q0, T), where
Q is its set of states, Ω (the same set of atomic events as before) is the input
alphabet, q0 ∈ Q is the initial state and δ : Q × Ω → Q is the transition
function. T ⊆ Q is the set of accepting states.

We picture 𝔄 as a labelled directed graph, whose vertices are elements of Q,
and the function δ is represented by directed edges labelled by elements of Ω:
the edge labelled by ω ∈ Ω from q ∈ Q leads to δ(q, ω). The initial state is
typically indicated by an unlabelled edge "from nowhere" to this state.
As the letters of the input word w ∈ Ω^+ come in one after another, we walk
in the graph, always choosing the edge labelled by the letter we receive.
What we do with the word depends on the state we are in upon reaching
the end of the word. If it is in T, the automaton accepts the input, otherwise
it rejects it.
Formally, to describe the computation of 𝔄 we extend δ to a function δ :
Q × Ω^+ → Q in the following way:

\[
\delta(q, w\omega) = \delta(\delta(q, w), \omega) \qquad \text{for } w \in \Omega^+,\ \omega \in \Omega .
\]

L(𝔄) ⊆ Ω^+ is the set of words accepted by 𝔄.

A Moore machine 𝔄 is a six-tuple 𝔄 = (Q, Ω, Δ, δ, h, q0), where (Q, Ω, δ, q0)
is a deterministic finite automaton without the set of accepting states, Δ is
a finite output alphabet and h is the output function Q → Δ. In addition to
what 𝔄 does as a finite automaton, at each step it reports to the outside
world the value h(q) of the state q in which it is at the moment. When drawing
a Moore machine we indicate h by labelling the states of its underlying finite
automaton by their values under h. In addition, we almost always make
certain graphical simplifications: we merge all the transitions joining the
same pair of states into a single transition, labelled by the union (evaluated
in Σ) of all the labels. Sometimes we go even further and drop the label
altogether from one transition, which means that all the remaining input
letters follow this transition.
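The step-by-step behaviour of a Moore machine can be sketched in a few lines of Python. The class name `MooreMachine`, the dictionary encoding of δ and h, and the example machine (which reports whether the letter "x" has occurred so far) are all assumptions made for illustration only.

```python
class MooreMachine:
    """A Moore machine given by a transition table, an output map and an initial state."""

    def __init__(self, delta, h, q0):
        self.delta = delta  # dict: (state, letter) -> state
        self.h = h          # dict: state -> output letter
        self.q0 = q0        # initial state

    def run(self, word):
        """Return the outputs reported after reading each letter of the word."""
        q = self.q0
        outputs = []
        for letter in word:
            q = self.delta[(q, letter)]
            outputs.append(self.h[q])
        return outputs

# Hypothetical example: output 1 once the letter "x" has been seen, 0 before that.
once_x = MooreMachine(
    delta={("no", "x"): "yes", ("no", "y"): "no",
           ("yes", "x"): "yes", ("yes", "y"): "yes"},
    h={"no": 0, "yes": 1},
    q0="no",
)
```

Note that the output word always has the same length as the input word, since one value is reported per letter read.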

Formally, a Moore machine computes a function f_𝔄 : Ω^+ → Δ^+ defined by

\[
f_{\mathfrak{A}}(\omega_1\omega_2 \ldots \omega_n) = h(\delta(q_0,\omega_1))\, h(\delta(q_0,\omega_1\omega_2)) \cdots h(\delta(q_0,\omega_1\omega_2 \ldots \omega_n))
\]

(note that |f_𝔄(ω1ω2 ... ωn)| = n, as desired), and a function g_𝔄 : Ω^∞ → Δ^∞
defined by

\[
g_{\mathfrak{A}}(\omega_1\omega_2 \ldots) = \bigsqcup \{ f_{\mathfrak{A}}(\omega_1\omega_2 \ldots \omega_n) \mid n = 1, 2, \ldots \}.
\]

We will be interested in Moore machines which compute 3-valued functions.
This amounts to partitioning the state set Q of 𝔄 into three subsets T, F, B,
which we often make into parts of the machine. If we do so, we call the states
in T the accepting states and the states in F the rejecting states. There will
be no special name for the states in B.

A Moore machine 𝔄 is called counter-free if there is no word w ∈ Ω^+ and
no states q1, q2, ..., qs, s > 1, such that δ(q1, w) = q2, ..., δ(q_{s−1}, w) =
qs, δ(qs, w) = q1.

2.5 Markov chains

For us, Markov chains are a synonym of Markov chains with stationary
transitions and finite state space.

Formally, given a finite set I of states and a fixed function p : I × I → [0, 1]
satisfying

\[
(\forall i \in I) \quad \sum_{j \in I} p(i,j) = 1, \tag{2}
\]

the Markov chain with state space I and transitions p is a sequence X =
X0, X1, ... of random variables Xn : W → I, such that

\[
\Pr(X_{n+1} = j \mid X_n = i) = p(i,j). \tag{3}
\]

The standard result of probability theory is that there exists a probability
triple (W, 𝔐, Pr) and a sequence X such that (3) is satisfied. W is indeed
the space of infinite sequences of ordered pairs of elements from I, and Pr
is a certain product measure on this set.

One can arrange the values p(i,j) in a matrix Π = (p(i,j); i, j ∈ I). Of
course, p(i,j) ≥ 0 and Σ_{j∈I} p(i,j) = 1 for every i. Every real square
matrix Π satisfying these conditions is called stochastic. Likewise, the
initial distribution of X is that of X0, which can be conveniently represented
by a vector Π0 = (p(i); i ∈ I). Its choice is independent of the function p(i,j).
It is often very convenient to represent Markov chains by matrices, since
many manipulations on Markov chains correspond to natural algebraic
operations performed on the matrices.

For our purposes, it is convenient to imagine the Markov chain X in another,
equivalent form: let K_I be the complete directed graph on the vertex set I.
First we randomly choose the starting vertex in I, according to the initial
distribution. Next, we start walking in K_I; at each step, if we are in the
vertex i, we choose the edge (i,j) to follow with probability p(i,j). If we
define Xn = (the vertex in which we are after n steps), then Xn is indeed
the same Xn as in (3).

So we will be able to draw Markov chains. Doing so, we will often omit 
edges (i, j) with p{i,j) = 0. 

Classification of states. For two states i, j of a Markov chain X with
transition probabilities p we say that i communicates with j iff there is a
nonzero probability of eventually getting from i to j. Equivalently, it means
that there is a sequence i = i1, i2, ..., in = j of states such that
p(ik, ik+1) > 0 for k = 1, ..., n − 1. The reflexive relation of mutual
communication (i.e., that i communicates with j and j communicates with i,
or i = j) is an equivalence relation on I. Class [i] communicates with
class [j] iff i communicates with j.

The relation of communication is a partial ordering relation on classes. The
minimal elements in this partial ordering are called ergodic sets, and
non-minimal elements are called transient sets. The elements of ergodic and
transient sets are called ergodic and transient states, respectively.
A Markov chain all of whose ergodic sets are one-element sets is called
absorbing, and its ergodic states are called absorbing.

For ergodic sets one can further define their period. The period of an ergodic
state i is the gcd of all the numbers m ≥ 1 such that there is a sequence
i = i0, i1, ..., im = i of states such that p(ik, ik+1) > 0 for
k = 0, ..., m − 1. It can be shown that the period is a class property, i.e.,
all states in one ergodic class have the same period.

An ergodic set is called aperiodic iff its period is 1. Equivalently, it means
that for every two i, j in this set and all sufficiently large n there exists
a sequence i = i1, i2, ..., in = j of states such that p(ik, ik+1) > 0 for
k = 1, ..., n − 1.

Every periodic class C of period p > 1 can be partitioned into p periodic
subclasses C1, ..., Cp such that
Pr(X_{n+1} ∈ C_{k+1 (mod p)} | X_n ∈ C_{k (mod p)}) = 1
for all k.
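The classification above can be made concrete with a small Python sketch. It assumes the chain is given as a square list-of-lists stochastic matrix `P` indexed by 0, ..., n−1; the function names are hypothetical. Ergodic classes are detected as the classes that cannot be left, and the period is computed by a standard breadth-first argument: within a class, the gcd of dist(u) + 1 − dist(v) over all edges (u, v) equals the gcd of the lengths of all closed walks.

```python
from collections import deque
from math import gcd

def reachable(P, i):
    """States reachable from i along transitions of positive probability (including i)."""
    seen, stack = {i}, [i]
    while stack:
        u = stack.pop()
        for v, p_uv in enumerate(P[u]):
            if p_uv > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def communicating_classes(P):
    """Equivalence classes of mutual communication."""
    reach = [reachable(P, i) for i in range(len(P))]
    classes, assigned = [], set()
    for i in range(len(P)):
        if i not in assigned:
            cls = frozenset(j for j in reach[i] if i in reach[j])
            classes.append(cls)
            assigned |= cls
    return classes

def is_ergodic(P, cls):
    """A class is ergodic iff no transition of positive probability leaves it."""
    return all(P[u][v] == 0 for u in cls for v in range(len(P)) if v not in cls)

def period(P, cls):
    """Period of an ergodic class (gcd of the lengths of its closed walks)."""
    start = min(cls)
    dist, queue, g = {start: 0}, deque([start]), 0
    while queue:
        u = queue.popleft()
        for v in cls:
            if P[u][v] > 0:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                g = gcd(g, dist[u] + 1 - dist[v])
    return g
```

For the three-state matrix `[[0, 1, 0], [1, 0, 0], [0.5, 0, 0.5]]`, states 0 and 1 form an ergodic set that alternates deterministically, while state 2 is transient.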

3 Constructing conditionals 

We make a terminological distinction. If we speak about a conditional object,
we do not assume any probability space structure imposed on Ω. When we
have such a structure (Ω, Σ, Pr), we speak about a conditional event instead.

3.1 Conditional objects 

First of all, let us note that any TL formula can be understood as a definition
of a pre-conditional from 𝒫𝒞, which is indeed 2-valued. Indeed, states of any
model of temporal logic can be interpreted as elements of Ω, and the whole
model is thus an element of Ω^+. The value the pre-conditional assigns to a
model M is 1 if M, n ⊨ φ and 0 otherwise.

We construct a three-valued extension (TL|TL) of TL as the set of all pairs
(φ|ψ) of formulas from TL. The operator (·|·) can be understood as a present
tense connective of pre-conditionals, and, since formulas of TL are 2-valued,
it is sufficient to define its action as follows:



(x|y)     y = 0    y = 1
x = 0       ⊥        0
x = 1       ⊥        1
Definition 1. A conditional object of type 1 is a pre-conditional c ∈ 𝒫𝒞,
definable in (TL|TL). The set of such conditional objects is denoted 𝒞.

Definition 2. A conditional object of type 2 is a pre-conditional c+ ∈ 𝒫𝒞+,
such that c+ is computable by a 3-valued counter-free Moore machine. The
set of such conditional objects is denoted 𝒞+.

Definition 3. A conditional object of type 3 is a pre-conditional c∞ ∈ 𝒫𝒞∞,
such that c∞ is computable by a 3-valued counter-free Moore machine. The
set of such conditional objects is denoted 𝒞∞.
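Since Definitions 2 and 3 rely on counter-freeness, it may help to see how the property can be tested mechanically. The following Python sketch is a brute-force check under two assumptions: the states q1, ..., qs in the definition of a counter are taken to be pairwise distinct, and the machine is small enough for its transition monoid to be enumerated. A machine is then counter-free iff, for every nonempty word, the state transformation it induces has no cycle other than fixed points. The function name and encoding are hypothetical.

```python
def is_counter_free(states, alphabet, delta):
    """Brute-force counter-freeness test via the transition monoid.

    states   -- list of states
    alphabet -- list of input letters
    delta    -- dict: (state, letter) -> state
    """
    index = {q: i for i, q in enumerate(states)}
    n = len(states)
    # State transformation induced by each single letter, as a tuple of indices.
    letters = [tuple(index[delta[(q, a)]] for q in states) for a in alphabet]

    seen, todo = set(letters), list(letters)
    while todo:
        t = todo.pop()
        # After n applications of t every state has landed on a cycle of t.
        x = tuple(range(n))
        for _ in range(n):
            x = tuple(t[i] for i in x)
        # Counter-free requires every such cycle to be a fixed point.
        if any(t[y] != y for y in set(x)):
            return False
        # Extend the word by one more letter: first t, then the letter map g.
        for g in letters:
            u = tuple(g[t[i]] for i in range(n))
            if u not in seen:
                seen.add(u)
                todo.append(u)
    return True
```

A machine that remembers whether some letter has occurred passes the test, whereas a parity counter (a letter swapping two states) fails it, which is the intuition behind the name.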

The following proposition says that the conditional objects of types 1, 2 and 
3 are identical up to the way of representing pre-conditionals. 



Theorem 4.

𝒞+ = {c+ ∈ 𝒫𝒞+ / c ∈ 𝒞},

𝒞∞ = {c∞ ∈ 𝒫𝒞∞ / c ∈ 𝒞},

𝒞 = {c ∈ 𝒫𝒞 / c∞ ∈ 𝒞∞}.



Proof. The equalities 𝒞+ = {c+ ∈ 𝒫𝒞+ / c∞ ∈ 𝒞∞} and 𝒞∞ = {c∞ ∈
𝒫𝒞∞ / c+ ∈ 𝒞+} are obvious. What remains to be proven are 𝒞+ = {c+ ∈
𝒫𝒞+ / c ∈ 𝒞} and 𝒞 = {c ∈ 𝒫𝒞 / c+ ∈ 𝒞+}.

It is well-known Q that propositional temporal logic of past tense and
(finite) deterministic automata are of equal expressive power, i.e., in our
terminology, the sets of 2-valued pre-conditionals from 𝒫𝒞 definable in TL and
computable by deterministic finite automata are equal. Indeed, the
translations between temporal logic and automata are effective.
We start with the first equality. Let c be defined by a (TL|TL) formula (φ|ψ).
Let 𝔄 = (Q_𝔄, Ω, δ_𝔄, q_𝔄, T_𝔄) and 𝔅 = (Q_𝔅, Ω, δ_𝔅, q_𝔅, T_𝔅) be
deterministic finite automata, computing the functions Ω^+ → 2 defined by φ
and ψ, respectively.

Consider the Moore machine (𝔄|𝔅) = (Q_𝔄 × Q_𝔅, Ω, 3, δ, h, (q_𝔄, q_𝔅)), where

\[
\delta((p,q),\omega) = (\delta_{\mathfrak{A}}(p,\omega),\ \delta_{\mathfrak{B}}(q,\omega)),
\]

\[
h((p,q)) =
\begin{cases}
1 & \text{if } p \in T_{\mathfrak{A}} \text{ and } q \in T_{\mathfrak{B}},\\
0 & \text{if } p \notin T_{\mathfrak{A}} \text{ and } q \in T_{\mathfrak{B}},\\
\bot & \text{otherwise.}
\end{cases}
\]

It is immediate to see that (𝔄|𝔅) computes exactly (φ|ψ)+.
To prove the second equality, let 𝔄 = (Q, Ω, 3, δ, h, q0) be a Moore
machine computing c+. We construct two deterministic finite automata 𝔄1 =
(Q, Ω, δ, q0, h^{-1}({1})) and 𝔄2 = (Q, Ω, δ, q0, h^{-1}({0, 1})) from 𝔄, where
h^{-1} stands for the co-image under h. Now let φ1 and φ2 be TL formulae
corresponding to 𝔄1 and 𝔄2, respectively.

It is again immediate to see that (φ1|φ2) defines exactly the conditional in
𝒞 computed in 𝒞+ by 𝔄. □
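Both constructions used in the proof are easy to sketch in Python. The encoding below is an assumption made for illustration: a deterministic finite automaton is a triple `(delta, q0, accepting)`, atoms are strings listing the basic events that hold ("ab", "a", "b", or "-" for none), and ⊥ is encoded as `None`. The helper names are hypothetical.

```python
UNDEF = None  # encodes ⊥ in this sketch

def conditional_outputs(dfa_a, dfa_b, word):
    """Run the product machine (A|B) on a word and return its 3-valued outputs."""
    (delta_a, p, acc_a), (delta_b, q, acc_b) = dfa_a, dfa_b
    outputs = []
    for letter in word:
        # Both components move in parallel on the same letter.
        p, q = delta_a[(p, letter)], delta_b[(q, letter)]
        if q in acc_b:
            outputs.append(1 if p in acc_a else 0)
        else:
            outputs.append(UNDEF)
    return outputs

def split_accepting_sets(h):
    """Converse direction: accepting sets h^-1({1}) and h^-1({0, 1})."""
    true_states = {q for q, v in h.items() if v == 1}
    defined_states = {q for q, v in h.items() if v in (0, 1)}
    return true_states, defined_states

def holds_now(event, atoms):
    """DFA accepting the words whose last atom contains the given basic event."""
    delta = {(s, atom): int(event in atom) for s in (0, 1) for atom in atoms}
    return delta, 0, {1}

ATOMS = ["ab", "a", "b", "-"]
A = holds_now("a", ATOMS)  # corresponds to the formula a
B = holds_now("b", ATOMS)  # corresponds to the formula b
```

With these two automata, `conditional_outputs(A, B, word)` gives the values of the simple conditional (a|b) along the word: undefined whenever b fails, and otherwise the truth value of a.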
Consequently, we can freely choose between the three available
representations of conditional objects. Doing so, we regard (TL|TL) as the
logic of conditional objects, while Moore machines provide their machine
representation. All these representations are equivalent, thanks to Theorem 4.
The classes 𝒞, 𝒞+ and 𝒞∞ represent the semantics of conditional objects,
and again we can freely choose the particular kind of semantical objects,
thanks to the equalities (1) relating c, c+ and c∞.

As an example, the simple conditional (a|b) ∈ (TL|TL) is computed by the
following Moore machine.

Figure 1: Moore machine representing conditional object (a|b).

The above Moore machine, as is easily seen, acts exactly according to the
rule "ignore b^c's, decide depending on the truth status of a when b appears".
So indeed it represents the repetitions of the experiment for (a|b) according
to the "bridge" repetition rule: start the history anew.

3.2 Conditional events 

We will be using the name conditional events to refer to conditionals
considered with a probability space in the background.
Let (Ω, 𝒫(Ω), Pr) be a probability space.

Definition 5 (Conditional event). Let c ∈ 𝒞 be a conditional object
over Ω. Suppose Ω is endowed with a probability space structure (Ω, Σ, Pr).
With c we associate the sequence Y = Y(c) = Y1, Y2, ... of random variables
Ω^∞ → 3, defined by the formula

\[
Y_n(w) = \text{$n$-th-letter-of}(c_\infty(w)), \tag{4}
\]

where Ω^∞ is considered with the product probability structure.
We call Y the conditional event associated with c, and denote it [c], while
Yn is then denoted [c]_n. Note that we do not include the probability space
in the notation. It will always be clear what (Ω, Σ, Pr) is.

In particular, Pr([c]_n = 1) is the probability that at time n the conditional
is true, Pr([c]_n = 0) is the probability that at time n the conditional is
false, and Pr([c]_n = ⊥) is the probability that at time n the conditional is
undefined.

Definition 6 (Probability of conditional events).

We define the asymptotic probability at time n of a conditional c by the
formula

\[
\Pr\nolimits_n(c) = \frac{\Pr([c]_n = 1)}{\Pr([c]_n = 0 \text{ or } 1)}. \tag{5}
\]

If the denominator is 0, Pr_n(c) is undefined.
The asymptotic probability of c is

\[
\Pr(c) = \lim_{n \to \infty} \Pr\nolimits_n(c), \tag{6}
\]

provided that Pr_n(c) is defined for all sufficiently large n and the limit
exists.
We will regard [c] as the probabilistic semantics of c.
If φ ∈ TL then we write Pr(φ) for Pr((φ|true)).

It is perhaps reasonable to explain why we want the conditional event and 
its probability to be defined in this way. The main motivation is that we 
want the conditional event and its probability to be natural and intuitive. 
And we achieve this by using the recipe of de Finetti, which in our case 
materializes in the above definitions. 



4 Underlying Markov chains, Bayes' Formula and 
classification of conditional events 

4.1 Underlying Markov chains 

Let c be a conditional object and let 𝔄 = (Q, Ω, 3, δ, h, q0) be a counter-free
Moore machine which computes c∞.

We define a Markov chain X = X(𝔄) by taking the set of states of X to be
the set Q of states of 𝔄, and the transition function p to be defined by

\[
p(q,q') = \sum_{\omega \in \Omega:\ \delta(q,\omega) = q'} \Pr(\{\omega\}).
\]

Indeed, for every q we have

\[
\sum_{q'} p(q,q') = \sum_{q'} \sum_{\substack{\omega \in \Omega \\ \delta(q,\omega) = q'}} \Pr(\{\omega\}) = \sum_{\omega \in \Omega} \Pr(\{\omega\}) = 1,
\]

which means that the function p satisfies (2), which is the criterion for being
a transition probability function of a Markov chain. The initial probability
distribution is defined by

\[
p(q) =
\begin{cases}
1 & \text{if } q \text{ is the initial state of } \mathfrak{A},\\
0 & \text{otherwise.}
\end{cases}
\]


Therefore we have indeed converted $\mathfrak{A}$ into a Markov chain $X$.
In the pictorial representation the conversion process is much simpler: we
take the drawing of $\mathfrak{A}$, replace all the letters marking transitions
by their probabilities according to $\Pr$, and then contract multiple transitions
between the same states into a single one, summing up their probabilities.
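The conversion just described can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the machine is given as a dictionary `delta` from (state, atomic event) pairs to states, `pr` maps atomic events to their probabilities, and the two-state machine with atoms `u`, `v`, `w` is a hypothetical example rather than one taken from the paper.

```python
from collections import defaultdict
from fractions import Fraction

def moore_to_markov(states, delta, pr):
    """Return p with p[q][q2] = sum of pr[w] over atoms w with delta[(q, w)] == q2."""
    p = {q: defaultdict(Fraction) for q in states}
    for q in states:
        for atom, weight in pr.items():
            # Parallel transitions between the same pair of states are contracted
            # into one by summing their probabilities.
            p[q][delta[(q, atom)]] += weight
    return p

# Hypothetical machine: two states, three atomic events.
states = ["s0", "s1"]
pr = {"u": Fraction(1, 2), "v": Fraction(1, 4), "w": Fraction(1, 4)}
delta = {
    ("s0", "u"): "s0", ("s0", "v"): "s1", ("s0", "w"): "s1",
    ("s1", "u"): "s0", ("s1", "v"): "s1", ("s1", "w"): "s1",
}
p = moore_to_markov(states, delta, pr)
```

In this example the transitions labelled `v` and `w` from `s0` to `s1` are merged into a single transition of probability 1/2, and every row of `p` sums to 1.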

Theorem 7. X is a Markov chain in which only transient and aperiodic 
states exist. 

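Proof. Suppose $X$ has a periodic set $C$ of period $p > 1$, and let $C_1, \ldots, C_p$ be
its division into periodic subclasses. Let $\omega \in \Omega$ be any atomic event with
$\Pr(\{\omega\}) > 0$. Let $q \in C_1$. Since $\Pr(X_{n+1} \in C_{k+1\ (\mathrm{mod}\ p)} \mid X_n \in C_{k\ (\mathrm{mod}\ p)}) = 1$
for all $k$, it follows that $\delta^1(q,\omega) = \delta(q,\omega) \in C_{2\ (\mathrm{mod}\ p)}$ and likewise
$\delta^{k+1}(q,\omega) = \delta(\delta^k(q,\omega),\omega) \in C_{k+2\ (\mathrm{mod}\ p)}$ for $k \geq 1$.

However, $C$ is finite, so there must be $s \neq t$, say $s < t$, such that $\delta^s(q,\omega) = \delta^t(q,\omega)$.
The sequence

$$\delta^s(q,\omega),\ \delta^{s+1}(q,\omega),\ \ldots,\ \delta^t(q,\omega) = \delta^s(q,\omega)$$

is a cycle whose length $t - s$ is a multiple of $p > 1$, and
thus violates the assumption that $\mathfrak{A}$ is counter-free. □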

The next corollary follows from the classical result about finite Markov chains.

Corollary 8. For every state $i$ of $X$, the limit $\lim_{n\to\infty} \Pr(X_n = i)$ exists.

Using $h : Q \to \mathbf{3}$, the acceptance mapping of $\mathfrak{A}$, we get

Theorem 9. $[c] = h(X)$. □

Note that $[c]$ defined above need not be a Markov chain itself, but it is a
simple projection of a Markov chain, extracting all the invariant information.
Of course, it will typically be very beneficial to work most of the time with
$X$, having the whole theory of Markov chains as a tool-set, and only then to
move to $[c]$.

Let us examine the previously given definition of $(a|b)$ to see what its
probability is.

The Markov chain looks as follows:

Figure 2: Markov chain corresponding to the Moore machine on Fig. 1 (surviving transition labels: $\Pr(ba)$, $\Pr(ba^c)$)

where the initial distribution assumes probability 1 given to the state pointed
to by the arrow "from nowhere".

It is easy to check that $\Pr((a|b)) = \Pr(ba)/\Pr(b)$, provided that $\Pr(b) > 0$.
Indeed, for every $n$ we have $\Pr([(a|b)]_n = 1) = \Pr(ba)$ and $\Pr([(a|b)]_n = 0) =
\Pr(ba^c)$, so $\Pr([(a|b)]_n = 0 \text{ or } 1) = \Pr(ba) + \Pr(ba^c) = \Pr(b)$. This is so because,
no matter in which state we are, these are the probabilities of getting to 1
and 0 in the next step, respectively. This evaluation will follow from Bayes'
Formula below, too.
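This evaluation can also be checked numerically. The sketch below assumes a three-state reading of the chain (states labelled 1, 0 and undefined, with the next state determined by the current atomic event alone, and the process started in the undefined state); the atom probabilities are arbitrary illustrative values of our own choosing.

```python
from fractions import Fraction

# Illustrative probabilities of the four atoms generated by a and b.
pr = {
    ("a", "b"): Fraction(1, 5),
    ("not a", "b"): Fraction(3, 10),
    ("a", "not b"): Fraction(1, 10),
    ("not a", "not b"): Fraction(2, 5),
}

def value(atom):
    """Truth value of (a|b) on one atomic event: 1, 0, or None for undefined."""
    a_part, b_part = atom
    if b_part != "b":
        return None
    return 1 if a_part == "a" else 0

# The next label depends only on the current atom, so every row of the
# transition matrix is the same distribution over the labels 1, 0, None.
row = {1: Fraction(0), 0: Fraction(0), None: Fraction(0)}
for atom, weight in pr.items():
    row[value(atom)] += weight

def distribution_at(n, initial):
    """Distribution of the label after n >= 1 steps, starting from `initial`."""
    dist = dict(initial)
    for _ in range(n):
        total = sum(dist.values())
        dist = {label: total * row[label] for label in row}
    return dist

initial = {1: Fraction(0), 0: Fraction(0), None: Fraction(1)}
pr_b = pr[("a", "b")] + pr[("not a", "b")]
ratios = []
for n in range(1, 6):
    d = distribution_at(n, initial)
    ratios.append(d[1] / (d[1] + d[0]))
```

Every entry of `ratios` equals $\Pr(ba)/\Pr(b)$, independently of $n$, as claimed in the text.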

4.2 Bayes' Formula 

First of all, let us note that for each $\star \in \mathbf{3}$ the limit $\lim_{n\to\infty} \Pr([c]_n = \star)$
exists, since, for any choice of a Moore machine $\mathfrak{A}$ computing $c^+$ and
assuming $X = X(\mathfrak{A})$, $\Pr([c]_n = \star)$ is a sum of $\Pr(X_n = i)$ over all states $i$ of
$X$ with $h(i) = \star$, and the latter probabilities converge by Corollary 8.
A conditional event is called regular iff $\lim_{n\to\infty} \Pr([c]_n = 0 \text{ or } 1) > 0$. In
particular, for regular conditionals the limit in (6) always exists and is equal
to

$$\frac{\lim_{n\to\infty} \Pr([c]_n = 1)}{\lim_{n\to\infty} \Pr([c]_n = 0 \text{ or } 1)}.$$

Turning to the logical representation of conditionals, we have thus

Theorem 10 (Bayes' Formula). For $(\varphi|\psi) \in (\mathrm{TL}|\mathrm{TL})$

$$\Pr((\varphi|\psi)) = \frac{\Pr(\varphi \wedge \psi)}{\Pr(\psi)},$$

whenever the right-hand side above is well-defined. □

Note that Bayes' Formula has been expected by de Finetti for the frequency
based conditionals.

4.3 Classifying conditional events

It is interesting to consider the conditionals $c$ for which $\lim_{n\to\infty} \Pr([c]_n = 0 \text{ or } 1) = 0$.
We can distinguish two types of such conditional events: those
for which $\Pr([c]_n = 0 \text{ or } 1)$ is identically 0 for infinitely many $n$, and those
for which it is nonzero for all but finitely many $n$. The former will be called
degenerate, the latter strange. We call strictly degenerate those degenerate
events for which $\Pr([c]_n = 0 \text{ or } 1) = 0$ for all but finitely many $n$.
The degenerate conditional events correspond to bets which infinitely often
cannot be resolved, because they are undefined, and strictly degenerate
events are those which are almost never defined.

Strange conditional events are more interesting. Bayes' Formula is
meaningless for them, so we have to use some ad hoc methods to see if their
asymptotic probability exists or not.

The first example shows that the sequence $\Pr_n(c)$ can be nonconvergent for
strange $c$.

Consider $c_1 = (a \mid \blacksquare((\bullet a \to a^c) \wedge (\bullet a^c \to a) \wedge (\neg\bullet\mathrm{true} \to a)))$, where
$0 < \Pr(a) < 1$. The long temporal formula asserts that $a^c$ always follows $a$
and $a$ always follows $a^c$, and at the beginning of the process ($n = 1$), where
$\bullet\,\mathrm{true}$ is false, $a$ holds.
It is easily verified that

$$\Pr\nolimits_n(c_1) = \begin{cases} 1 & \text{if } n \text{ is odd,} \\ 0 & \text{if } n \text{ is even.} \end{cases}$$

Thus the finite-time behaviour of this conditional is not probabilistic: its
truth value depends solely on the age of the system. So for somebody
expecting a pure game of chances its behaviour must seem strange (and
hence the name of this class of conditional events).

Note that we have just discovered the next feature of conditionals expected
by de Finetti: nonconvergence of the limiting frequency when probability of
the 'given' part tends to 0.

However, again following de Finetti, if $(\varphi|\varphi)$ is strange, its asymptotic
probability is 1. E.g., for $\psi = \blacksquare((\bullet a \to a^c) \wedge (\bullet a^c \to a) \wedge (\neg\bullet\mathrm{true} \to a))$
we have $\Pr((\psi|\psi)) = 1$.
Moreover, for $c_2 = (a \mid \blacksquare((\bullet a \to a^c) \wedge (\bullet a^c \to a)))$ we have

$$\Pr\nolimits_n(c_2) = \begin{cases} 1 - \Pr(a) & \text{if } n \text{ is even,} \\ \Pr(a) & \text{if } n \text{ is odd.} \end{cases}$$

Indeed, here the 'given' part requires that $a$'s and $a^c$'s alternate, but does not
specify what is the case at the beginning of the process. So the probability
of the whole conditional at odd times is the probability that $a$ has happened
at time 1, and at even times it is the probability that $a$ has not happened at
time 1. Therefore, when $\Pr(a) = 1/2$, $\Pr(c_2)$ exists and is 1/2. So asymptotic
probabilities which are neither 0 nor 1 are possible for strange conditional
events.

At present, the question whether it is decidable if a given strange
conditional event has an asymptotic probability is open. However, we believe
that the answer is positive and offer it as our conjecture.

Conjecture 1. The set of conditional events which have asymptotic proba- 
bility is decidable. Moreover, for those events which have asymptotic proba- 
bility, its value is effectively computable. 



5 Connectives of conditionals 

5.1 Present tense connectives 

Let us recall that present tense connectives are those whose definition in
(TL|TL) does not use temporal connectives, and therefore depends on the
present only. Equivalently, an $n$-ary present tense connective is completely
characterised by a function $\mathbf{3}^n \to \mathbf{3}$.

Here are several possible choices for the conjunction, which is always defined
as a pointwise application of the following three-valued functions. Above each
table we display the notation for the corresponding kind of connective.



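$x \wedge_{\mathrm{SAC}} y$

$$\begin{array}{c|ccc}
x \backslash y & 0 & 1 & \bot \\ \hline
0 & 0 & 0 & 0 \\
1 & 0 & 1 & 1 \\
\bot & 0 & 1 & \bot
\end{array}$$

$x \wedge_{\mathrm{GNW}} y$

$$\begin{array}{c|ccc}
x \backslash y & 0 & 1 & \bot \\ \hline
0 & 0 & 0 & 0 \\
1 & 0 & 1 & \bot \\
\bot & 0 & \bot & \bot
\end{array}$$

$x \wedge_{\mathrm{Sch}} y$

$$\begin{array}{c|ccc}
x \backslash y & 0 & 1 & \bot \\ \hline
0 & 0 & 0 & \bot \\
1 & 0 & 1 & \bot \\
\bot & \bot & \bot & \bot
\end{array}$$

$\sim x$

$$\begin{array}{c|c}
x & \sim x \\ \hline
0 & 1 \\
1 & 0 \\
\bot & \bot
\end{array}$$

$x \vee_{\mathrm{SAC}} y$

$$\begin{array}{c|ccc}
x \backslash y & 0 & 1 & \bot \\ \hline
0 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 \\
\bot & 0 & 1 & \bot
\end{array}$$

$x \vee_{\mathrm{GNW}} y$

$$\begin{array}{c|ccc}
x \backslash y & 0 & 1 & \bot \\ \hline
0 & 0 & 1 & \bot \\
1 & 1 & 1 & 1 \\
\bot & \bot & 1 & \bot
\end{array}$$

$x \vee_{\mathrm{Sch}} y$

$$\begin{array}{c|ccc}
x \backslash y & 0 & 1 & \bot \\ \hline
0 & 0 & 1 & \bot \\
1 & 1 & 1 & \bot \\
\bot & \bot & \bot & \bot
\end{array}$$
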



They can be equivalently described by syntactical manipulations in (TL|TL). 
The reduction rules are as follows: 



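$$\begin{aligned}
(a|b) \wedge_{\mathrm{SAC}} (c|d) &= (abcd \vee abd^c \vee cdb^c \mid b \vee d) \\
(a|b) \wedge_{\mathrm{GNW}} (c|d) &= (abcd \mid a^c b \vee c^c d \vee abcd) \\
(a|b) \wedge_{\mathrm{Sch}} (c|d) &= (abcd \mid bd) \\
\sim (a|b) &= (a^c \mid b) \\
(a|b) \vee_{\mathrm{SAC}} (c|d) &= (ab \vee cd \mid b \vee d) \\
(a|b) \vee_{\mathrm{GNW}} (c|d) &= (ab \vee cd \mid ab \vee cd \vee bd) \\
(a|b) \vee_{\mathrm{Sch}} (c|d) &= (ab \vee cd \mid bd).
\end{aligned} \tag{7}$$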
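The agreement between the three-valued tables and the reduction rules can be checked by brute force over all Boolean values of $a, b, c, d$. The following Python sketch does this; the encoding of the undefined value as `None` and all function names are our own illustrative choices.

```python
from itertools import product

U = None  # the undefined value

def cond(a, b):
    """Value of (a|b) for Boolean a, b: undefined unless b holds."""
    return int(a) if b else U

def and_sac(x, y):
    if x is U:
        return y
    if y is U:
        return x
    return x & y

def and_gnw(x, y):
    if x == 0 or y == 0:
        return 0
    return U if (x is U or y is U) else 1

def and_sch(x, y):
    return U if (x is U or y is U) else x & y

def neg(x):
    return U if x is U else 1 - x

def or_sac(x, y):
    if x is U:
        return y
    if y is U:
        return x
    return x | y

def or_gnw(x, y):
    if x == 1 or y == 1:
        return 1
    return U if (x is U or y is U) else 0

def or_sch(x, y):
    return U if (x is U or y is U) else x | y

# Right-hand sides of the reduction rules as (head, given) pairs of Booleans.
RULES = [
    (and_sac, lambda a, b, c, d: ((a and b and c and d) or (a and b and not d)
                                  or (c and d and not b), b or d)),
    (and_gnw, lambda a, b, c, d: (a and b and c and d,
                                  (not a and b) or (not c and d) or (a and b and c and d))),
    (and_sch, lambda a, b, c, d: (a and b and c and d, b and d)),
    (or_sac, lambda a, b, c, d: ((a and b) or (c and d), b or d)),
    (or_gnw, lambda a, b, c, d: ((a and b) or (c and d),
                                 (a and b) or (c and d) or (b and d))),
    (or_sch, lambda a, b, c, d: ((a and b) or (c and d), b and d)),
]

def rules_hold():
    """Check every reduction rule on all 16 Boolean assignments."""
    for a, b, c, d in product([False, True], repeat=4):
        if neg(cond(a, b)) != cond(not a, b):
            return False
        for op, rhs in RULES:
            head, given = rhs(a, b, c, d)
            if op(cond(a, b), cond(c, d)) != cond(head, given):
                return False
    return True
```

Calling `rules_hold()` returns `True`, so each rule reproduces its table on every combination of defined and undefined arguments that a pair of conditionals can produce.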

The first is based on the principle "if any of the arguments becomes defined,
act!". A good example would be the following quotation:

"One of the most dramatic examples of the unrecognised use 
of compound conditioning was the first military strategy of our 
nation. As the Colonialists waited for the British to attack, 
the signal was 'One if by land and two if by sea'. This is the 
conjunction of two conditionals with uncertainty!" 

Of course, if the above was understood as a conjunction of two conditionals,
the situation was crying for the use of $\wedge_{\mathrm{SAC}}$, whose definition has been
proposed independently by Schay, Adams and Calabrese (the author of the
quotation).

The conjunction $\wedge_{\mathrm{GNW}}$ represents a moderate approach, which in case of an
apparent evidence for 0 reports 0, but otherwise prefers to report unknown
in case of any doubt. Note that this conjunction is essentially the same as
lazy evaluation, known from programming languages.

Finally, the conjunction $\wedge_{\mathrm{Sch}}$ is least defined, and acts (classically) only if
both arguments become defined. It corresponds to the strict evaluation.
We have given an example for the use of $\wedge_{\mathrm{SAC}}$. The uses of $\wedge_{\mathrm{GNW}}$ and $\wedge_{\mathrm{Sch}}$
can be found in any computer program executed in parallel, which uses
either lazy or strict evaluation of its logical conditions. And indeed both of
them happily coexist in many programming languages, in that one of them
is the standard choice; the programmer can, however, explicitly override the
default and choose the other evaluation strategy.

Let us mention that all the three systems above are in fact well-known,
classical so to say, three-valued logics: $(\wedge_{\mathrm{GNW}}, \vee_{\mathrm{GNW}}, \sim)$ is the logic of
Łukasiewicz, $(\wedge_{\mathrm{SAC}}, \vee_{\mathrm{SAC}}, \sim)$ is the logic of Sobociński, and $(\wedge_{\mathrm{Sch}}, \vee_{\mathrm{Sch}}, \sim)$
is the logic of Bochvar.



5.2 Past tense connectives 

The following connective is very close to the conjunction of
the product space conditional event algebra introduced in [11]. Detailed
discussion of embeddings of existing algebras of conditional events into (TL|TL)
is included in the companion paper [^. Our new conjunction, denoted $\wedge^*$,
is defined precisely when at least one of its arguments is defined, so it
resembles $\wedge_{\mathrm{SAC}}$ in this respect, but instead of assigning the other argument
a default value when it is undefined, like $\wedge_{\mathrm{SAC}}$ does, it uses its most recent
defined value. However, when the other argument has never been
defined, it is assumed to act like false.

In the language of (TL|TL), $(a|b) \wedge^* (c|d)$ can be expressed by
$((b^c\ \mathrm{Since}\ (a \wedge b)) \wedge (d^c\ \mathrm{Since}\ (c \wedge d)) \mid b \vee d)$.
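The informal description of this conjunction can be sketched operationally on finite sequences of truth values. The Python function below is only an illustration of that description: the name `star_conjunction`, the encoding of the undefined value as `None`, and the sample sequences are our own assumptions.

```python
def star_conjunction(xs, ys):
    """Pointwise-in-time conjunction that remembers the last defined value.

    xs, ys -- sequences over {0, 1, None} giving the values of the two
              conditionals at times 1, 2, ...; None means undefined.
    The result at a given time is undefined iff both arguments are undefined;
    otherwise it is the classical conjunction of the most recent defined
    values, an argument that has never been defined acting like false (0).
    """
    last_x, last_y = 0, 0
    result = []
    for x, y in zip(xs, ys):
        if x is not None:
            last_x = x
        if y is not None:
            last_y = y
        if x is None and y is None:
            result.append(None)
        else:
            result.append(last_x & last_y)
    return result

values = star_conjunction([1, None, None, 0], [None, 1, None, None])
```

On the sample input the first value is 0 (the second argument has never been defined), the second is 1 (the remembered value of the first argument is 1), the third is undefined, and the fourth is 0.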



5.3 Conclusion 

We believe that there is no reason to restrict our attention to any
particular choice of an operation extending the classical conjunction, and call it
the conjunction of conditionals. There are indeed many reasonable such
extensions, which correspond to different intuitions and situations; they can
coexist in a single formalism, and any restriction in this respect necessarily
narrows the applicability of the formalism.

We believe that none of the choices discussed in this section is the
conjunction of conditionals. There are indeed many possible choices, and
all of them have their own merits. In fact already the original system of
Schay consisted of five operations: $\sim$, $\wedge_{\mathrm{SAC}}$, $\vee_{\mathrm{SAC}}$, $\wedge_{\mathrm{Sch}}$ and $\vee_{\mathrm{Sch}}$. Moreover,
he was aware that these operations still do not make the algebra functionally
complete (even in the narrowed sense, restricted to defining only operations
which are undefined for all undefined arguments). And in order to remedy
this he suggested to use one of several additional operators, one of them
being $\wedge_{\mathrm{GNW}}$. So for him all those operations could coexist in one system.

6 Three prisoner's puzzle

In order to demonstrate that our formalism allows for a precise treatment of 
problems with conditioning and probabilities, let us consider the following 
classical example of a probabilistic "paradox" . We will take this opportunity 
to highlight some of the practical issues of modelling using (TL|TL) and 
Moore machines approach. Therefore our analysis will be very detailed. 



6.1 The puzzle 



The three prisoner's puzzle [26] is the following:



Three prisoners are sentenced for execution. One day before 
their scheduled execution, prisoner A learns that two of them 
have been pardoned. A calculates a probability of 2/3 for him 
being pardoned. Then he asks the Guard: "Name me one of
my fellows who will be pardoned." The Guard tells him that B
will be pardoned. Based on that information, A recalculates the 
probability of being pardoned as 1/2, since now only one pardon 
remains for him and C (the third prisoner) to share! However, 
he could apply the same argument if the Guard had named C. 
Furthermore, he knew beforehand that at least one of his fellows 
will be pardoned — so what did he gain (or lose) by the answer? 

The intuitive explanation is that after learning the Guard's testimony $G(B)$
that B will be pardoned, A should revise the probability of the event $P(A)$
(of him being pardoned) by computing $P(\bullet P(A)|G(B))$, and the probability
evaluation yields in this case 2/3, as expected.

However, what he indeed calculated was $P(\bullet G(B)|P(A))$, assuming
effectively that the pardon had been given with equal probabilities to all the
pairs possible after Guard's testimony. This probability turns out to be
1/2.
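Both numbers can be reproduced by direct enumeration. The Python sketch below assumes, as an illustration, that the three pardoned pairs are equiprobable and that the Guard chooses between B and C with equal probability when both are pardoned; the variable names are ours.

```python
from fractions import Fraction

third, half = Fraction(1, 3), Fraction(1, 2)

# Possible outcomes: (pardoned pair, name given by the Guard, probability).
outcomes = [
    ("AB", "B", third),         # the Guard must name B
    ("AC", "C", third),         # the Guard must name C
    ("BC", "B", third * half),  # the Guard may name either fellow
    ("BC", "C", third * half),
]

def prob(event):
    """Total probability of the outcomes satisfying `event`."""
    return sum(p for pair, named, p in outcomes if event(pair, named))

def a_pardoned(pair, named):
    return "A" in pair

def guard_says_b(pair, named):
    return named == "B"

def both(pair, named):
    return a_pardoned(pair, named) and guard_says_b(pair, named)

# Probability that A is pardoned, given that the Guard named B.
correct = prob(both) / prob(guard_says_b)
# Probability that the Guard names B, given that A is pardoned.
mistaken = prob(both) / prob(a_pardoned)
```

Here `correct` evaluates to 2/3 and `mistaken` to 1/2, so the prisoner's error amounts to exchanging the roles of the event and the condition.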



6.2 Probability tree model 

First we present a simple probability tree analysis of the paradox, using the
method which originates with Huygens [19, ^] and is indeed almost as old
as the mathematically rigorous probability theory itself. We begin in the
leftmost circle (before pardon), then each of the three pardoned pairs leads
us to three next circles, indicating the situation after the pardon. Finally,
we have all the possible testimonies of the Guard. All edges originating
from the same circle are equiprobable. After Guard's testimony $G(B)$, only
the two top circles on the right are possible, and their probabilities are in
the proportion 2:1, the more probable one being the one in which A is
pardoned, while he is executed in the other one. So indeed even after the
testimony the probability that A is pardoned remains 2/3.

Figure 3: Probability tree analysis of the three prisoner puzzle.

6.3 (TL|TL) and Moore machine models

However, the tree shown above strongly resembles a Moore machine. And
indeed, we augment it with the necessary details below. The most substantial
change is that the Moore machine requires that the same set of atomic
possibilities, which determine the next transition, be given at each state. Therefore:

• The Guard testifies something irrelevant while the court decides the 
pardons, and the court decides something irrelevant while the Guard 
testifies. This change is made invisible by our convention of collapsing 
transitions and applying subsequently Boolean algebra simplifications, 
except that 

• In cases when the Guard has no choice, we must replace the existing 
transition label by the full event, because the Guard has prescribed 
answer no matter whom he would like to name, 

• And except that we have to decide about transitions from the states 
which are terminal in the tree model. Because we believe that after 
being pardoned nobody can be prosecuted again for the same crime, 
and we do not believe in reincarnation, either, our choice is to use 
self-loops in the terminal states, yielding a "Russian roulette" model. 

This provides a next piece of evidence that our definition of conditional
events is natural and close to intuitions. In fact, one can embed the whole
probability tree model into the formalism of Russian roulette Markov chains
[28], which shows that our model of conditionals extends the method of
probability trees.

Figure 4: Probability tree analysis of the three prisoner puzzle with
extensions necessary to convert the diagram into a Moore machine.

Next we attempt to model the same paradox syntactically in (TL|TL). The
construction of a correct (TL|TL) representation is a little bit more
complicated than the formula $P(\bullet P(A)|G(B))$ we have suggested previously,
as this requires specifying the actions of the Guard, whose probabilities are
affected by the pardon decision. So we assume that the Guard always tosses
a coin. If he gets heads ($H$), he tells the alphabetically first name among
those applicable, and in case of tails ($T$) the alphabetically last among them.
This indicates the need to consider the strategy followed by the Guard. And
in fact, the probabilities A calculates depend on what he assumes about this
strategy. So indeed now the answers of the Guard are shorthands for the
combinations of the pardon decision and the coin toss outcome. Therefore
$G(B)$ is $(\bullet P(AB) \wedge (H \vee T)) \vee (\bullet P(BC) \wedge H)$.

Moreover, we have to decide what should be modelled by the conditional
object, and what by the probability assignment, which turns the former
into a stochastic process. The general rule is that the more of the modelling
is encoded in the probability assignment, the simpler the conditional and
its Moore machine are. On the other hand, encoding everything in the
probability distribution is difficult and prone to errors, as the example of
the poor prisoner shows. And, needless to say, a good model is one in which
the proportions are just right. More on that below.
So formally the conditional looks now as follows:

$$((\bullet AB) \vee (\bullet AC) \mid ((\bullet AB) \wedge (H \vee T)) \vee ((\bullet BC) \wedge H)), \tag{8}$$

with £ = {P(AB), P(BC), AC, H, T}, where the events P(AB), P(BC) and
AC are mutually exclusive and equiprobable, and similarly H and T are mutually
exclusive and equiprobable. (Our construction will easily handle non-equal
probabilities, i.e., biased pardon decision and/or biased coin, too.) So the set
$\Omega$ of atomic events is $\{AB\,H, AB\,T, BC\,H, BC\,T, AC\,H, AC\,T\}$, and these
events are equiprobable under our probability assignment. However, we
will be able to calculate the probability of (8) without the equiprobability
assumption, too.
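As a sanity check of this modelling choice, the ratio of the probability of the conjunction of both parts of (8) to the probability of its 'given' part can be computed by enumerating pairs of consecutive atomic events. The Python sketch below assumes that the time is at least 2 and that consecutive atomic events are independent; the function name and the biased probabilities in the second example are illustrative assumptions of ours.

```python
from fractions import Fraction
from itertools import product

def probability_of_conditional(pardon, coin):
    """Ratio Pr(head and given) / Pr(given) for the conditional (8).

    pardon -- probabilities of the pardon decisions "AB", "BC", "AC"
    coin   -- probabilities of the coin outcomes "H", "T"
    The bullet operator refers to the previous atomic event, so we enumerate
    all pairs (previous atomic event, current atomic event).
    """
    atoms = list(product(pardon, coin))

    def weight(atom):
        return pardon[atom[0]] * coin[atom[1]]

    head_and_given = Fraction(0)
    given = Fraction(0)
    for previous, current in product(atoms, repeat=2):
        w = weight(previous) * weight(current)
        prev_pardon, cur_coin = previous[0], current[1]
        psi = prev_pardon == "AB" or (prev_pardon == "BC" and cur_coin == "H")
        phi = prev_pardon in ("AB", "AC")
        if psi:
            given += w
            if phi:
                head_and_given += w
    return head_and_given / given

fair = probability_of_conditional(
    {"AB": Fraction(1, 3), "BC": Fraction(1, 3), "AC": Fraction(1, 3)},
    {"H": Fraction(1, 2), "T": Fraction(1, 2)},
)
biased = probability_of_conditional(
    {"AB": Fraction(1, 2), "BC": Fraction(1, 4), "AC": Fraction(1, 4)},
    {"H": Fraction(1, 3), "T": Fraction(2, 3)},
)
```

In the equiprobable case the result is 2/3; in the biased example it is 6/7, which agrees with $\Pr(AB)/(\Pr(AB) + \Pr(BC)\Pr(H))$.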

Note that, e.g., assuming events A, B and C to be nonexclusive individual
pardon decisions of probability 1/3 each, leads to a more complicated
conditional expression, because a substantial amount of coding effort must be used
just to ensure that always precisely two prisoners are pardoned. This makes
the Moore machine more complicated, too. So this is certainly not a good
model, because what can be easily taken care of by the probability
assignment is instead modelled by logical methods. Such a model can of course be
correct^, but good means for us more than just correct.
But if we attempt to draw the Moore machine of our conditional, we discover
that it is quite different from that on Fig. ^.

The overall structure of the Moore machine is as follows: The entry states 
and transitions are dotted. Each of the three lines of three states (they form 
roughly edges of a triangle), consists of states with the same, already known 
pardon decision in the next experiment, while the current experiment's out- 
come is represented as the label of the state. Transitions are shown for one 
state on each edge only, because their targets depend on the input only, and 
not on the source within that edge. And this is why we can calculate the 
probability of (8) in a quite straightforward way. For time greater than 1
the probability of getting in two steps to a state with a given label does not 
depend on the current state nor on the time. Essentially, after the first step 
the edge of the triangle is chosen, which corresponds to the move to one of 
the states in the middle column of Fig. ^. In the second step we move to the 
state with the label equal to the destination label from Fig. and the edge 
it is found within depends on the next experiment, already. The similarity 
is even stronger if we compare Fig. 5 with Fig. 9 rather than with Fig. ^. A
formal calculation, using matrix calculus, can be found in Section 6.4 below.
The most substantial difference is that (8) is not a "Russian roulette" model!
To see this, set time to 3 and observe: the present outcomes depend on the
pardon decisions made at time 2, while the Guard was testifying in the
previous round of the experiment, and while we are hearing the testimony
of the Guard now, the pardons are already decided as a part of the next
experiment. So the probabilistic choices which we described as irrelevant for
the Moore machine model are parts of the previous/next repetition schema
here. The overlapping experiments do not interfere, however, so this does
not affect probabilities. Furthermore, all the final outcome undefined values
have been merged into one state. Finally, there are entry states which are
visited just once and correspond to the situation at time 1, when the Guard
says something, but there is no pardon decision to compare it with.

^Although unnecessary complications certainly increase the risk of mistakes and make
verification of the model harder.

A modified version of (8), which is Russian roulette, is as follows:

$$((@_1 AB) \vee (@_1 AC) \mid ((@_1 AB) \wedge @_2(H \vee T)) \vee ((@_1 BC) \wedge @_2 H)), \tag{9}$$

where $@_1\alpha$ is $\blacklozenge(\neg\bullet\mathrm{true} \wedge \alpha)$ and $@_2\alpha$ is $\blacklozenge(\bullet\mathrm{true} \wedge \neg\bullet\bullet\mathrm{true} \wedge \alpha)$, and
express that $\alpha$ is true at time 1 and 2, respectively.




Figure 6: Moore machine of dH). It is the minimalization of the Moore 
machine from Fig. ^, so they are indeed logically indistinguishable. 

The general conclusion is that simple Moore machines can correspond to
complicated (TL|TL) formulas, and simple (TL|TL) descriptions can yield
complicated Moore machines. If we additionally take into account that it is
hard to expect that any computer program will ever be able to transform
human-readable representations of one kind into human-readable
representations of the other kind^, we recommend that the whole process of modelling
be done using only one of the formalisms, without mixing them.

^In both cases even graphical layout can have a huge impact on the readability of the
model!

6.4 Algorithm for calculating the probability 

Of course, the natural method to compute the probability of a given regular
conditional $c$ in our model is to refer to an underlying Markov chain $X$,
perform the computations there, and then use the formula

$$\Pr(c) = \frac{\sum_{i : h(i) = 1} \lim_{n\to\infty} \Pr(X_n = i)}{\sum_{i : h(i) = 0 \text{ or } 1} \lim_{n\to\infty} \Pr(X_n = i)},$$

which follows directly from Bayes' Formula.

The calculation of $\lim_{n\to\infty} \Pr(X_n = i)$ is generally known to be polynomial
time in the number of states of the Markov chain, assuming unit cost of
arithmetical operations |2C]. The book [g9|] contains the account of
state-of-the-art algorithms for numerical calculations of the limiting probabilities.
As an example we calculate here the probability of the formula (8), using
the simplest possible approach, assuming that all the events from $\Omega$ have
nonzero probability.

We assume the following numbering of the states of the Markov chain from 
Fig.g 






Figure 7: Numbering of the states of Markov chain resulting from the Moore
machine in Fig. ^.

Then the matrix $\Pi$ of transition probabilities is

where $AB$ stands for $\Pr(AB)$, and similarly for arguments $BC$, $AC$, $H$, $T$
(the matrix does not fit into the page when the standard notation is used).
It can be directly checked that the square of this matrix has all entries
positive, hence the whole represents a single ergodic class. (This is what breaks
down when some elements from $\Omega$ have probability 0. If this is permitted,
one has to consider a few more cases.) It is known that in such cases the
limiting probability does not depend on the initial probabilities of getting
into this class, therefore we can ignore the dotted (transient) states from
Fig. 5. The limiting probabilities can be found, given $\Pi = (p_{ij})$, by finding
the only solution of the system of linear equations

$$\begin{cases}
\sum_{i=1}^{9} x_i = 1, \\
\sum_{i=1}^{9} p_{i1} x_i = x_1, \\
\sum_{i=1}^{9} p_{i2} x_i = x_2, \\
\quad\vdots \\
\sum_{i=1}^{9} p_{i9} x_i = x_9,
\end{cases}$$

which yields the following unique solution:

$$\begin{array}{lll}
x_1 = AB^2 & x_2 = BC\,AB\,H & x_3 = AB(1 - AB - BC\,H) \\
x_4 = AB\,BC & x_5 = BC^2\,H & x_6 = BC(1 - AB - BC\,H) \\
x_7 = AC\,AB & x_8 = BC\,AC\,H & x_9 = AC(1 - AB - BC\,H)
\end{array}$$

and the asymptotic probability of the conditional represented by the Moore
machine in question is

$$\frac{\Pr(AB)}{\Pr(BC)\Pr(H) + \Pr(AB)},$$

as expected. In particular, in the equiprobable case the value is 2/3.
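The calculation can be reproduced exactly in Python. The sketch below rests on our own reading of the construction, which should be treated as an assumption: a state is a pair consisting of the pardon decision already drawn for the next round and the current truth value (1, 0 or `None` for undefined), and the order of states is ours rather than that of the figure. The limiting probabilities are obtained by Gauss-Jordan elimination over rationals.

```python
from fractions import Fraction
from itertools import product

def stationary(P):
    """Exact solution of x P = x, sum(x) = 1, for a chain with one ergodic class."""
    n = len(P)
    # Balance equations for all columns but the last, then the normalisation row.
    rows = [[P[i][j] - (1 if i == j else 0) for i in range(n)] + [Fraction(0)]
            for j in range(n - 1)]
    rows.append([Fraction(1)] * (n + 1))
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def three_prisoners_chain(pardon, coin):
    """Assumed nine-state chain: (pending pardon decision, current truth value)."""
    def label(pending, c):
        if pending == "AB":
            return 1          # Guard names B, and A is pardoned
        if pending == "BC" and c == "H":
            return 0          # Guard names B, and A is not pardoned
        return None           # Guard does not name B: undefined

    states = [(e, l) for e in pardon for l in (1, 0, None)]
    index = {s: k for k, s in enumerate(states)}
    P = [[Fraction(0)] * len(states) for _ in states]
    for (e, l), (e2, c) in product(states, product(pardon, coin)):
        P[index[(e, l)]][index[(e2, label(e, c))]] += pardon[e2] * coin[c]
    return states, P

pardon = {"AB": Fraction(1, 3), "BC": Fraction(1, 3), "AC": Fraction(1, 3)}
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
states, P = three_prisoners_chain(pardon, coin)
x = stationary(P)
true_mass = sum(x[k] for k, (_, l) in enumerate(states) if l == 1)
defined_mass = sum(x[k] for k, (_, l) in enumerate(states) if l is not None)
probability = true_mass / defined_mass
```

With equiprobable pardons and a fair coin, `probability` evaluates to 2/3, and the limiting probability of the state with pending decision AB and truth value 1 is 1/9.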



7 Related work and possible extensions 

7.1 Related work 

• Using temporal logic in reasoning about knowledge is nothing new. 
Indeed, many logics of knowledge incorporate temporal operators, see 
1^]. However, to the best of our knowledge, (TL|TL) is the very first 
multi-valued temporal logic to be considered. In particular, the above 
mentioned logics of knowledge are two-valued. Moreover, (TL|TL) is 
the first natural use of past tense temporal logic in computer science. 
Most of the established formalisms which use propositional temporal
logic, indeed use its future tense fragment.

• Computing of conditional probabilities $\Pr(\varphi|\psi)$ is not new, either, and
has been considered by several authors, including ||2^, mostly
for first order logic of unordered structures.

• Finally, Markov chains have already been used for evaluation of prob- 
abilities of logical statements. In particular, our Bayes' Formula is 
a simple extension of a theorem of Ehrenfeucht (see [^]), phrased 
there as a theorem about first order logic of ordered unary structures 
(over which first order logic is equally as expressive as propositional 
temporal logic, see [^). 

7.2 Possible extensions. 

• (TL|TL) is not closed under its own connectives, since the nesting of
the conditioning operator $(\cdot|\cdot)$ with other connectives (let alone itself)
is not allowed, and since the temporal connectives cannot be applied
to a conditional pair. As a consequence, operations on conditionals are
defined by disassembling the pairs and reassembling them afterwards,
to yield a pair in the correct syntactical form again.

We would like to have an equivalent logic with much better syntactical
structure. This should be possible by extending the ideas of multivalued
modal logics, investigated in |^], by multivalued counterparts
of Since. The logic would then assume the form of a propositional
logic with multivalued temporal connectives and conditioning.

The big question is whether one can retain the Bayes' Formula then. 
The existing attempts in the present tense logics of conditionals sug- 
gest it might be difficult. 

• (TL|TL) does not match exactly the class of automata, which for any 
assignment of probabilities yield a Markov chain with all states either 
transient or aperiodic. In such Markov chains all the limiting probabil- 
ities do exist, and thus every such Markov chain can be meaningfully 
considered to represent an extended kind of a conditional. Indeed, 
below is a simple example of such an automaton. 

We would like to have an extension of (TL|TL), matching exactly the
class of Markov chains with only transient and aperiodic states, to
take advantage of the maximal class of Markov chains for which
the limiting probabilities exist, and thus all the definitions given in the
paper make sense. We expect the logic to be obtained by extending
the multivalued temporal logic suggested above, rather than
by extending the present syntax.

Acknowledgement. The first author wishes to thank Igor Walukiewicz
for valuable information concerning temporal logic.

Figure 8: It is not hard to verify that, no matter what probability is assigned
to the event $a$, the resulting Markov chain has only transient and aperiodic
states. However, the automaton is not counter-free, since it has two states
reachable by a path labelled $aa^c$ from each other.